Explaining Similarity of Terms

Authors

  • Vishnu Vyas
  • Patrick Pantel
Abstract

Computing the similarity between entities is a core component of many NLP tasks such as measuring the semantic similarity of terms for generating a distributional thesaurus. In this paper, we study the problem of explaining post-hoc why a set of terms are similar. Given a set of terms, our task is to generate a small set of explanations that best characterizes the similarity of those terms. Our contributions include: 1) an information-theoretic objective function for quantifying the utility of an explanation set; 2) a survey of psycholinguistics and philosophy for evidence of different sources of explanations such as descriptive properties and prototypes; 3) computational baseline models for automatically generating various types of explanations; and 4) a qualitative evaluation of our explanation generation engine.

Similar resources

Case-Based Reasoning for Explaining Probabilistic Machine Learning

This paper describes a generic framework for explaining the prediction of probabilistic machine learning algorithms using cases. The framework consists of two components: a similarity metric between cases that is defined relative to a probability model and a novel case-based approach to justifying the probabilistic prediction by estimating the prediction error using case-based reasoning. As ba...

Explaining Probabilistic Fault Diagnosis and Classification Using Case-Based Reasoning

This paper describes a generic framework for explaining the prediction of a probabilistic classifier using preceding cases. Within the framework, we derive similarity metrics that relate the similarity between two cases to a probability model and propose a novel case-based approach to justifying a classification using the local accuracy of the most similar cases as a confidence measure. As basi...

An Empirical Comparison of Distance Measures for Multivariate Time Series Clustering

Multivariate time series (MTS) data are ubiquitous in science and daily life, and how to measure their similarity is a core part of the MTS analysis process. Many research efforts in this context have focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...

Measuring the Similarity of Trajectories Using Fuzzy Theory

In recent years, the advancement of positioning systems has provided access to large amounts of movement data. One method of discovering knowledge from this type of data is to measure the similarity of trajectories resulting from the movement of objects. Similarity measurement has also been used in other data mining methods such as classification and clustering and is currently, an...

Explaining the dimensions and components of the evaluation model of human resource management system with a strategic approach (Case study of Iran's petrochemical industry)

The human resources system is a source of advantage in organizations, but these resources must be continuously evaluated to become a real source of advantage. The aim of this study was to identify a model for evaluating the human resource management system with a strategic approach for petrochemical companies. In terms of purpose, this research is applied in terms of qualitative data and in ter...



Publication date: 2008